DNA encoding protein and methods of using same

ABSTRACT

The present invention relates to novel tools for improving MPA production. In particular, the present invention relates to fungal enzymes that are specific for MPA synthesis.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 12/663,773, filed on Apr. 1, 2010, which claims the benefit and priority to and is a U.S. National Phase Application of PCT International Application Number PCT/DK2008/050138, filed on Jun. 12, 2008, designating the United States of America and published in the English language, which is an International Application of and claims the benefit of priority to European Patent Application No. EP 07110287.5, filed on Jun. 14, 2007, and U.S. Provisional Application No. 60/943,932, filed on Jun. 14, 2007. The disclosures of the above-referenced applications are hereby expressly incorporated by reference in their entireties.

SEQUENCE LISTING IN ELECTRONIC FORMAT

The present application is being filed along with a Sequence Listing in electronic format. The sequence listing is provided as a file entitled PLOUG39.004C1, created Mar. 12, 2013 which is 74 KB in size. The information in the electronic format of the sequence listing is incorporated herein by reference in its entirety.

TECHNICAL FIELD OF THE INVENTION

The present invention relates to the field of fungal secondary metabolites. In particular the present invention relates to enzymes involved in the synthesis of mycophenolic acid (MPA).

BACKGROUND OF THE INVENTION

Mycophenolic acid (MPA) is a natural compound produced by some fungi, mainly of the Penicillium genus. MPA has a wealth of applications, the most important at present being its use as a key drug in the treatment of organ transplant patients. MPA was first discovered in 1893 and has been investigated thoroughly since its discovery. However, despite the importance of this drug, no information is available about the enzymes responsible for MPA synthesis in the fungus. On an industrial scale, MPA is thus currently produced by relatively laborious and inefficient fermentation processes of the natural fungus, primarily Penicillium brevicompactum.

Hence, there exists a need in the art for improved methods for producing MPA. Furthermore, it is likely that new commercial applications of MPA, and thereby an increased demand for the compound, will result from cheaper and more efficient production methods.

SUMMARY OF THE INVENTION

Thus, an object of the present invention relates to the isolation of the genes encoding the enzymes involved in the production of MPA.

In a first aspect, the present invention thus relates to an expression vector comprising at least one polynucleotide sequence encoding a polypeptide, wherein said polypeptide is selected from the group consisting of: SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, and SEQ ID NO: 8, and wherein said polypeptide(s) have a sequence identity of at least 70% with the sequence(s) set forth in SEQ ID NOs 1-5, and 7-8, and wherein said polypeptide has a sequence identity of at least 90% with the sequence set forth in SEQ ID NO: 6. SEQ ID NOs 1-8 encode enzymes involved in the MPA synthesis in the P. brevicompactum fungus.

In further aspects, the present invention relates to host cells comprising the vector according to the invention as well as methods for cultivating such host cells in order to produce MPA.

In yet further aspects, the present invention relates to:

-   i) an isolated polynucleotide sequence encoding a polypeptide with 80-100% identity with SEQ ID NO: 1,
-   ii) an isolated polynucleotide sequence encoding a polypeptide with 80-100% identity with SEQ ID NO: 2,
-   iii) an isolated polynucleotide sequence encoding a polypeptide with 80-100% identity with SEQ ID NO: 3,
-   iv) an isolated polynucleotide sequence encoding a polypeptide with 80-100% identity with SEQ ID NO: 4,
-   v) an isolated polynucleotide sequence encoding a polypeptide with 80-100% identity with SEQ ID NO: 5,
-   vi) an isolated polynucleotide sequence encoding a polypeptide with 90-100% identity with SEQ ID NO: 6,
-   vii) an isolated polynucleotide sequence encoding a polypeptide with 80-100% identity with SEQ ID NO: 7,
-   viii) an isolated polynucleotide sequence encoding a polypeptide with 80-100% identity with SEQ ID NO: 8, and
-   ix) a polypeptide encoded by any one of these polynucleotide sequences.

A final aspect relates to use of host cells according to the present invention for production of MPA.

DETAILED DESCRIPTION OF THE INVENTION

The Italian physician Bartolomeo Gosio discovered the antibiotic effect of mycophenolic acid (MPA) in 1893 by observing that the anthrax bacillus was inhibited by one of his purified fungal metabolites from Penicillium brevicompactum. Interestingly, MPA was thereby the first antibiotic to be crystallised from a living organism, and since Gosio's discovery more than 100 years ago, MPA has turned out to be a “miracle drug”. It has been used as an immunosuppressant in kidney, heart and liver transplantations and has been reported to possess antiviral, antifungal, antibacterial, antitumor, and anti-psoriasis activities.

Analyses by Birch et al. 1957 showed that MPA belongs to the group of compounds named meroterpenoids. Meroterpenoids are compounds which consist of a polyketide fused to a mevalonate pathway intermediate. MPA consists of a polyketide fused to farnesyl diphosphate, the latter being derived from the mevalonate pathway. Thus two distinct pathways are involved in the production of MPA.

Several Penicillium spp. are capable of producing MPA, and due to its fundamental biological activities great interest has been dedicated to the elucidation of the structure, the biosynthesis and the mechanism behind its promising biological properties. Fungal production of MPA has been shown in the following Penicillium species: P. brevicompactum, P. stoloniferum, P. scabrum, P. nagemi, P. szaferi, P. patris-mei, P. griscobrunneum, P. viridicatum, P. carneum, P. arenicola, P. echinulatum, P. verrucosum, and P. brunneo-stoloniferum. In addition, the fungus Byssochlamys nivea has also been reported to produce MPA.

Even though a polyketide synthase (PKS) is very likely involved in MPA synthesis, and even though most PKS proteins share conserved regions, it was not possible to design PKS primers that allowed cloning of the MPA PKS in P. brevicompactum. The most likely explanation is that the structural diversity produced by fungal PKSs is enormous and that the fungus encodes a large number of different PKS enzymes responsible for producing a large number of polyketides (MPA being a polyketide). Furthermore, the structure of MPA indicates that the MPA PKS should belong to a non-reducing type with methyl transferase activity, a type that thus far did not contain any characterized PKS enzymes. There was therefore reason to believe that the MPA PKS would differ in sequence from other known PKS enzymes.

The structure of MPA (formula I) is shown below:

The IUPAC name of MPA is: (E)-6-(4-hydroxy-6-methoxy-7-methyl-3-oxo-1,3-dihydroisobenzofuran-5-yl)-4-methylhex-4-enoic acid.

MPA inhibits Inosine Monophosphate Dehydrogenase (IMPDH) (EC 1.1.1.205). IMPDH is an important enzyme in the de novo biosynthesis of GMP, catalyzing the nicotinamide adenine dinucleotide (NAD) dependent oxidation of IMP to xanthosine-5-monophosphate (XMP). Since GMP is one of the building blocks of DNA, IMP dehydrogenase is an obvious target for drugs intended for DNA biosynthesis inhibition, such as anti-cancer agents. There are two GMP producing pathways: (i) the “de novo pathway”, where IMP is a key-intermediate; and (ii) the “salvage pathway” in which free purines are formed in catabolic processes and reconverted to nucleoside monophosphates by reacting with 5-phospho-α-D-ribofuranosyl diphosphate.

MPA inhibits the proliferation of lymphocytes because they are almost entirely dependent on the de novo GMP biosynthesis pathway. Cancer cell lines are, however, less sensitive to MPA, as they are capable of obtaining GMP via both the de novo pathway and the salvage pathway.

IMPDH proteins from approximately 125 different organisms have thus far been isolated and they show a high degree of similarity. Some organisms contain more than one gene encoding putative IMPDH proteins. Unpublished blast searches performed by the inventors in connection with the present invention revealed that fungal genomes closely related to P. brevicompactum (Aspergillus oryzae, Aspergillus terreus, Magnaporthe grisea and Neurospora crassa) contain only one copy of the IMPDH gene. No IMPDH sequences from P. brevicompactum have thus far been reported.

It has previously been shown that an MPA-resistant strain of Candida albicans tolerates high titers of MPA due to overexpression of the IMPDH gene. In connection with the present invention, it was a crucial step to realize that a similar natural mechanism renders P. brevicompactum MPA resistant: the P. brevicompactum genome encodes two IMPDH genes.

It was presumed that the enzymes responsible for MPA synthesis would be present in a gene cluster in the genome of the P. brevicompactum fungus since it has previously been reported that many naturally occurring polyketides are produced by enzymes that are all present within a specific gene cluster.

The inventors succeeded in identifying the MPA biosynthesis gene cluster in P. brevicompactum by screening the genome for IMPDH genes by the use of a BAC library as described in the Examples.

In a first aspect, the present invention thus relates to an expression vector comprising at least one polynucleotide sequence encoding a polypeptide, wherein said polypeptide is selected from the group consisting of: SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, and SEQ ID NO: 8, and wherein said polypeptide(s) have a sequence identity of at least 70% with the sequence(s) set forth in SEQ ID NOs 1-5, and 7-8, and wherein said polypeptide has a sequence identity of at least 90% with the sequence set forth in SEQ ID NO: 6. SEQ ID NOs 1-8 encode enzymes involved in the MPA synthesis in the P. brevicompactum fungus. In the fungus, the genes encoding these eight polypeptides are present in a gene cluster.

It is understood that the term “an expression vector” also covers the situation where the selected sequences are inserted into two or more expression vectors.

In a preferred embodiment, one or more expression vectors encode at least two of the said polypeptides, more preferably at least three, even more preferably at least four, even more preferably at least five, even more preferably at least six, even more preferably at least seven, and most preferably eight polypeptides.

Likewise, the present invention relates to host cells comprising such vectors. The host cell may be any cell that can be grown in culture such as bacteria, mammalian cells, fungal cells, plant cells, etc. However, as it appears that some polypeptides are post-translationally processed, it is preferred to use eukaryotic host cells. It is even more preferred to use fungal cells such as e.g. a yeast cell or a fungus that naturally produces MPA. Yeast cells have the advantage of being relatively easy to ferment in a large scale and yeasts may thus be a practical host cell for many applications.

It follows that the invention furthermore relates to a method of cultivating a host cell according to the present invention, wherein said method comprises growing the cell in a growth media under appropriate conditions. In a preferred embodiment, the method further comprises the step of recovering and optionally purifying MPA.

In yet further aspects, the present invention relates to:

-   i) An isolated polynucleotide sequence encoding a polypeptide with 70-100%, preferably 80-100%, and most preferably 90-100% identity with SEQ ID NO: 1. SEQ ID NO: 1 corresponds to the polypeptide encoded by mpaA. mpaA encodes a polypeptide with the characteristics of a prenyl transferase. In a preferred embodiment, the conserved areas in the encoded polypeptide have a degree of identity of at least 80%, preferably at least 90%, and most preferably at least 95% with the corresponding conserved areas in SEQ ID NO: 1.
-   ii) An isolated polynucleotide sequence encoding a polypeptide with 70-100%, preferably 80-100%, and most preferably 90-100% identity with SEQ ID NO: 2. SEQ ID NO: 2 corresponds to the polypeptide encoded by mpaB, a polypeptide with unknown activity that is most likely involved in MPA biosynthesis.
-   iii) An isolated polynucleotide sequence encoding a polypeptide with 70-100%, preferably 80-100%, and most preferably 90-100% identity with SEQ ID NO: 3. SEQ ID NO: 3 corresponds to the polypeptide encoded by mpaC, a putative polyketide synthase (PKS). In a preferred embodiment, the conserved areas in the encoded polypeptide have a degree of identity of at least 80%, preferably at least 90%, and most preferably at least 95% with the corresponding conserved areas in SEQ ID NO: 3.
-   iv) An isolated polynucleotide sequence encoding a polypeptide with 70-100%, preferably 80-100%, and most preferably 90-100% identity with SEQ ID NO: 4. SEQ ID NO: 4 corresponds to the polypeptide encoded by mpaD, a putative p450 monooxygenase. In a preferred embodiment, the conserved areas in the encoded polypeptide have a degree of identity of at least 80%, preferably at least 90%, and most preferably at least 95% with the corresponding conserved areas in SEQ ID NO: 4.
-   v) An isolated polynucleotide sequence encoding a polypeptide with 70-100%, preferably 80-100%, and most preferably 90-100% identity with SEQ ID NO: 5. SEQ ID NO: 5 corresponds to the polypeptide encoded by mpaE, a putative Zn dependent hydrolase. In a preferred embodiment, the conserved areas in the encoded polypeptide have a degree of identity of at least 80%, preferably at least 90%, and most preferably at least 95% with the corresponding conserved areas in SEQ ID NO: 5.
-   vi) An isolated polynucleotide sequence encoding a polypeptide with 90-100%, preferably 95-100% identity with SEQ ID NO: 6. SEQ ID NO: 6 corresponds to the polypeptide encoded by mpaF, a putative IMPDH. In a preferred embodiment, the conserved areas in the encoded polypeptide have a degree of identity of at least 90%, preferably at least 95%, with the corresponding conserved areas in SEQ ID NO: 6.
-   vii) An isolated polynucleotide sequence encoding a polypeptide with 70-100%, preferably 80-100%, and most preferably 90-100% identity with SEQ ID NO: 7. SEQ ID NO: 7 corresponds to the polypeptide encoded by mpaG, a putative O-methyltransferase. In a preferred embodiment, the conserved areas in the encoded polypeptide have a degree of identity of at least 80%, preferably at least 90%, and most preferably at least 95% with the corresponding conserved areas in SEQ ID NO: 7.
-   viii) An isolated polynucleotide sequence encoding a polypeptide with 70-100%, preferably 80-100%, and most preferably 90-100% identity with SEQ ID NO: 8. SEQ ID NO: 8 corresponds to the polypeptide encoded by mpaH, a putative hydrolase. In a preferred embodiment, the conserved areas in the encoded polypeptide have a degree of identity of at least 80%, preferably at least 90%, and most preferably at least 95% with the corresponding conserved areas in SEQ ID NO: 8.

It follows that the present invention furthermore relates to polypeptides encoded by any one of these polynucleotide sequences. Furthermore, the polypeptide may be a fragment thereof, wherein said fragment has a length of at least 100, preferably 150, more preferably 200, more preferably 250, and most preferably 300 amino acids.

Finally, the invention relates to the use of a host cell according to the invention for production of MPA.

It should be noted that embodiments and features described in the context of one of the aspects of the present invention also apply to the other aspects of the invention.

The invention will now be described in further details in the following non-limiting examples.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1: The MPA biosynthesis gene cluster in P. brevicompactum. The gene cluster is flanked by a 4 kb and a 7 kb region with no similarity to any known sequences. These regions are therefore thought to represent natural boundaries for the gene cluster. The physical map of the BACs overlapping the cluster is shown. The block arrows indicate the putative genes and their direction of transcription into mRNA based on sequence analysis and homology searches. Genes with domains that corresponded well to required enzymatic activities for MPA biosynthesis are designated mpaA-mpaH. X1-X4: XbaI restriction sites. The X3 site is located in the pECBAC1 cloning vector and is thus not part of the P. brevicompactum genomic DNA insert. Bold line around X2: region that hybridized with the IMPDH gene probe.

FIG. 2: Analysis of P. brevicompactum MpaC (polyketide synthase) (SEQ ID NO: 3) for the presence of conserved domains using the Conserved Domain Database (CDD) at the National Center for Biotechnology Information (NCBI). KS: beta-ketoacyl synthase. AT: acyltransferase. PP: phosphopantetheine attachment site. MT: Methyltransferase. Esterase: esterase domain similar to Aes of E. coli. Gaps indicate predicted introns.

FIG. 3: Alignment of MT domains from various polyketide synthases (MlcA, MlcB, LNKS, LDKS, and mpaC). mpaC (SEQ ID NO: 3) from the P. brevicompactum MPA biosynthesis gene cluster contains three conserved motifs (Motif I in MpaC: ILEIGAGTG (SEQ ID NO: 33); motif II in MpaC: GQYDIVLS (SEQ ID NO: 34); motif III in MpaC: LLRPDGILC (SEQ ID NO: 35)). These motifs are known to be present in most PKS MT domains. The presence of an MT domain is consistent with the fact that methylation occurs at the tetraketide stage of the MPA biosynthesis.
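The motif check described for FIG. 3 can be illustrated with a minimal sketch that looks for the three MpaC motifs by exact string matching. The function name `find_motifs` and the toy sequence are hypothetical and are not part of the original disclosure; the toy sequence is not SEQ ID NO: 3, and real MT domains from other PKSs will generally differ from the MpaC motifs at some positions, so exact matching is only an illustration.

```python
# MpaC MT-domain motifs as given in the description of FIG. 3.
MT_MOTIFS = {
    "I": "ILEIGAGTG",    # SEQ ID NO: 33
    "II": "GQYDIVLS",    # SEQ ID NO: 34
    "III": "LLRPDGILC",  # SEQ ID NO: 35
}


def find_motifs(protein, motifs=MT_MOTIFS):
    """Return the 1-based start position of each motif, or None if absent.

    Hypothetical helper: exact matching only, first occurrence only.
    """
    protein = protein.upper()
    positions = {}
    for name, motif in motifs.items():
        index = protein.find(motif)
        positions[name] = index + 1 if index >= 0 else None
    return positions


# Hypothetical toy sequence built around the three motifs (NOT SEQ ID NO: 3).
toy_sequence = "MSA" + "ILEIGAGTG" + "KKKK" + "GQYDIVLS" + "PP" + "LLRPDGILC" + "E"
motif_positions = find_motifs(toy_sequence)
```

A sequence that lacks a motif yields `None` for that entry, which makes a missing motif easy to spot when several candidate PKS sequences are compared.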

FIG. 4: Illustration of the similarities between the UbiA catalyzed prenylation reaction from Escherichia coli and the MpaA catalyzed reaction from P. brevicompactum. Hydroxyl groups function as ortho-para directing activators for the alkylation reaction. For MPA this means that the C-6 is highly activated because of the two neighbouring hydroxyl groups.

FIG. 5: Illustration of the Phobius-predicted transmembrane helices in P. brevicompactum MpaA (prenyl transferase) (SEQ ID NO: 1). Seven transmembrane regions were identified, and the prenyl transferase consensus pattern was found between the second and third transmembrane segments, as indicated with a filled circle on the loop between transmembrane regions two and three. L#: number of amino acid residues in each loop.

FIG. 6: Alignment of P. brevicompactum MpaG to related O-methyltransferase proteins.

FIG. 7: Biosynthesis of MPA in P. brevicompactum. The putative enzymes were identified in this study and are assigned to reaction steps requiring enzymatic activities that match the predicted functions of the enzymes. Each step of the biosynthesis is numbered and used for reference in the text.

FIG. 8: Schematic representation of the bipartite gene targeting method. Grey arrows (→) represent primers used to construct gene targeting substrates.

FIG. 9: The following abbreviations are used in the figure: WT, the wild-type strain IBT23078; PB-pAN7-1, IBT23078 transformed with pAN7-1 plasmid; MPA1-1, MPA1-2, MPA1-3 and MPA1-8, IBT 23078 transformed with bipartite substrates.

Results of PCR analysis of genomic DNA from the wild-type and some transformants. A) Amplified upstream mpaC and upstream 2/3 HygR cassette using primers KO-MpaC-F1 (SEQ ID NO: 29) and Upst-HygR-N (SEQ ID NO: 26). B) Amplified downstream 2/3 HygR cassette and downstream mpaC using primers Dwst-HygF-N (SEQ ID NO: 27) and KO-MpaC-Re3 (SEQ ID NO: 30). The PCR product sizes expected from the deletion strains for A) and B) are 4.5 and 4.4 kb, respectively. For the wild-type or transformants carrying non-homologous recombination, no PCR product is expected. C) Amplified 1/3 of mpaC gene using primers KO-2 mpaC-UF (SEQ ID NO: 31) and KO-2 mpaC-URa (SEQ ID NO: 32). The expected PCR product for the wild-type strain is 2.6 kb, whereas no PCR product is expected for the deletion strains.
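The expected band pattern for reactions A), B) and C) amounts to a small decision table, sketched below. The function names, the 0.3 kb size tolerance and the "mpaC intact" label (covering both the wild-type and transformants with non-homologous integration) are assumptions made for illustration only; the expected product sizes are those stated for FIG. 9.

```python
from typing import Optional

# Expected PCR product sizes in kb, as stated for FIG. 9.
EXPECTED_KB = {"A": 4.5, "B": 4.4, "C": 2.6}


def band_present(observed_kb: Optional[float], expected_kb: float,
                 tolerance_kb: float = 0.3) -> bool:
    """True if a band was observed near the expected size.

    The tolerance is an assumed value, not taken from the source.
    """
    return observed_kb is not None and abs(observed_kb - expected_kb) <= tolerance_kb


def classify_strain(a_kb: Optional[float], b_kb: Optional[float],
                    c_kb: Optional[float]) -> str:
    """Classify a strain from the three PCR reactions (None = no product)."""
    a = band_present(a_kb, EXPECTED_KB["A"])
    b = band_present(b_kb, EXPECTED_KB["B"])
    c = band_present(c_kb, EXPECTED_KB["C"])
    if a and b and not c:
        # Both junction products present and no internal mpaC product.
        return "mpaC deletion"
    if c and not a and not b:
        # Internal mpaC product only: wild-type or non-homologous integration.
        return "mpaC intact"
    return "inconclusive"
```

For example, `classify_strain(4.5, 4.4, None)` returns `"mpaC deletion"`, while a strain giving only the 2.6 kb product in reaction C) is reported as `"mpaC intact"`.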

FIG. 10: The following abbreviations are used in the figure: WT, the wild-type strain IBT23078; PB-pAN7-1, IBT23078 transformed with pAN7-1 plasmid; MPA1-1, MPA1-3 and MPA1-8, mpaC deletion strains; MPA1-2, IBT23078 containing a randomly integrated HygB cassette.

HPLC profiles of the reference and some transformants. All strains were grown on YES agar at 25° C. for 5 days. All chromatograms are shown at the same scale.

FIG. 11: The following abbreviations are used in the figure: WT, the wild-type strain IBT23078; PB-pAN7-1, transformant containing the pAN7-1 plasmid; MPA1- and MPA2-series, transformants derived by the bipartite method.

Mycophenolic acid production by wild-type and transformants grown on YES agar at 25° C. for 5 days. Data represents the relative amount of mycophenolic acid produced by transformants compared to the wild-type.

DEFINITIONS

Prior to discussing the present invention in further details, the following terms and conventions will first be defined:

Polyketides: Polyketides are secondary metabolites from bacteria, fungi, plants, and animals. Polyketides are derived from the polymerization of acetyl and propionyl subunits in a similar process to fatty acid synthesis catalyzed by polyketide synthases (PKSs). Polyketides also serve as building blocks for a broad range of natural products. Polyketides are structurally a very diverse family of natural products with an extremely broad range of biological activities and pharmacological properties. Polyketide antibiotics, antifungals, cytostatics, anticholesterolemics, antiparasitics, coccidiostatics, animal growth promotants and natural insecticides are in commercial use. MPA is classified as a polyketide with an attached farnesyl side chain, an intermediate from the mevalonate pathway (MPA may furthermore be classified as a meroterpenoid). Other examples of polyketides of great commercial and therapeutic interest are the cholesterol lowering statins such as e.g. lovastatin, atorvastatin, etc. Many naturally occurring polyketides are produced by enzymes that are all present within a specific gene cluster.

Gene cluster: The term “gene cluster” indicates that a specific number of genes involved in a biosynthetic pathway are localized closely to each other in the genome and that there is a first gene and a last gene that define the physical outer boundaries of the cluster.

Growth medium: The growth medium may be solid, semi-solid or liquid and preferably contains an energy source as well as the required minerals (P, K, S, N, etc.).

Suitable incubation conditions: Preferred incubation conditions may vary depending on the host cell system. Some host cells may prefer mainly anaerobic conditions and others may prefer mainly aerobic conditions. All host cell systems prefer moist conditions, i.e. a water content in the media from 5-99%, preferably 10-90%, more preferably 20-80%, more preferably 30-70%, and most preferably 50-60%. Many host cell systems furthermore require continuous shaking. The incubation time may vary from less than 1 day to about a month, preferably 2-20 days, more preferably 4-15 days and most preferably 1-2 weeks.

Host cell: The term “host cells” denotes, for example, micro-organisms, insect cells, and mammalian cells, which can be, or have been, used as recipients for a recombinant vector or other transfer DNA, and includes the progeny of the original cell which has been transformed. It is understood that the progeny of a single parental cell may not necessarily be completely identical in morphology or in genomic or total DNA complement to the original parent, due to natural, accidental, or deliberate mutation. Specific examples of mammalian cells and insect cells include human-derived cells, mouse-derived cells, fly-derived cells, silk worm-derived cells, and the like. Also, microorganisms such as Escherichia coli and yeast may be used.

Yeast: Yeasts include e.g. the following genera Candida, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, Yarrowia, Acremonium, Aspergillus, Aureobasidium, Cryptococcus, Filibasidium, Fusarium, Humicola, Magnaporthe, Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces, Penicillium, Piromyces, Schizophyllum, Talaromyces, Thermoascus, Thielavia, Tolypocladium, or Trichoderma. Saccharomyces species include S. carlsbergensis, S. cerevisiae, S. diastaticus, S. douglasii, S. kluyveri, S. norbensis, and S. oviformiss. Aspergillus species include A. aculeatus, A. awamori, A. foetidus, A. japonicus, A. nidulans, A. niger, A. terreus (the genome has been sequenced), A. flavus (the genome has been sequenced), A. fumigatus (the genome has been sequenced), and A. oryzae. Fusarium species include F. bactridioides, F. cerealis, F. crookwellense, F. culmorum, F. graminearum, F. graminum, F. heterosporum, F. negundi, F. oxysporum, F. reticulatum, F. roseum, F. sambucinum, F. sarcochroum, F. sporotrichioides, F. sulphureum, F. torulosum, F. trichothecioides, and F. venenatum. Other yeast species include e.g. Humicola insolens, Humicola lanuginosa, Mucor miehei, Myceliophthora thermophila, Neurospora crassa, Penicillium purpurogenum, Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei, and Trichoderma viride.

Promoter: The terms “promoter”, “promoter region” or “promoter sequence” refer generally to transcriptional regulatory regions of a gene, which may be found at the 5′ or 3′ side of the coding region, or within the coding region, or within introns. As used herein the term promoter shall include any portion of genomic DNA (including genomic DNA disclosed herein), which is capable of initiating transcription of nucleotide sequences at levels detectable above background. Examples of suitable promoters for directing the transcription of the nucleic acid constructs of the present invention in a filamentous fungal host cell are promoters obtained from the genes for Aspergillus oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase, Aspergillus niger neutral alpha-amylase, Aspergillus niger acid stable alpha-amylase, Aspergillus niger or Aspergillus awamori glucoamylase (glaA), Rhizomucor miehei lipase, Aspergillus oryzae alkaline protease, Aspergillus oryzae triose phosphate isomerase, Aspergillus nidulans acetamidase, and Aspergillus nidulans glyceraldehyde 3-phosphate dehydrogenase (gpdA) and Fusarium oxysporum trypsin-like protease, as well as the NA2-tpi promoter (a hybrid of the promoters from the genes for Aspergillus niger neutral alpha-amylase and Aspergillus oryzae triose phosphate isomerase), and mutant, truncated, and hybrid promoters thereof. It follows that the endogenous promoters may likewise be employed.

Expression vector: A vector is a component or composition for facilitating cell transduction or transfection by a selected nucleic acid, or expression of the nucleic acid in the cell. Vectors include, e.g., plasmids, cosmids, viruses, BACs, PACs, P1, YACs, bacteria, poly-lysine, as well as linear nucleotide fragments etc. An “expression vector” is a nucleic acid construct or sequence, generated recombinantly or synthetically, with a series of specific nucleic acid elements that permit transcription of a particular nucleic acid sequence in a host cell. The expression vector can be part of a plasmid, virus, or nucleic acid fragment. The expression vector typically includes a nucleic acid to be transcribed operably linked to a promoter. The nucleic acid to be transcribed is typically under the direction or control of the promoter. The expression vector may replicate autonomously in the host cell or may integrate into the host genome after the transfection or transduction and replicate as part of the genome.

Sequence identity: The term “sequence identity” is a measure of the degree of identity between polynucleotide sequences or polypeptide sequences (on a nucleotide-by-nucleotide basis or amino acid-by-amino acid basis, respectively) over a window of comparison.
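As an illustration of this definition, the following minimal sketch computes percent identity position by position over a window of comparison. It assumes that the two sequences have already been aligned (gaps written as "-") and that the denominator is the number of aligned columns in the window; the definition above does not fix either choice, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumptions (not specified in the source): '-' marks a gap, columns
    where both sequences have a gap are ignored, and a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("window of comparison is empty")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)
```

Under these assumptions, `percent_identity("ACGT", "ACGA")` gives 75.0, and a candidate polypeptide could then be compared against a threshold such as the 70% or 90% values recited for SEQ ID NOs 1-8.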

EXAMPLES

Example 1 P. brevicompactum BAC Library

P. brevicompactum, strain IBT 23078, was obtained from the strain collection at Center for Microbial Biotechnology at the Technical University of Denmark. Chromosomal DNA was extracted from this strain. Amplicon Express subsequently constructed a 10 fold coverage BAC library on basis of the chromosomal P. brevicompactum DNA (PBBAC). The total number of clones in the library was 3,072.

Example 2 Screening of PBBAC Using IMPDH Primers

A first approach in the attempt to isolate the MPA gene cluster was to screen the library for PKS genes using degenerate primers designed on the basis of various conserved PKS domains. Several PKS gene fragments from genomic DNA were amplified with these primers and sequenced. However, based on alignments, none of these gene fragments belonged to the non-reducing PKS type with methyltransferase activity, as needed for the MPA PKS. Hence, the gene fragments could not be used as probes for the MPA PKS in PBBAC.

The second approach was to determine whether P. brevicompactum encodes more than one IMPDH gene; if so, the MPA gene cluster might be found in sequences neighbouring one of these IMPDH gene copies. The rationale for this hypothesis was that an extra IMPDH gene copy may be a prerequisite for the existence of an enzymatic pathway leading to synthesis of a compound (MPA) that inhibits the very activity of IMPDH. This coexistence might be reflected in the genomic structure as close physical proximity of the MPA gene cluster and the extra copy of the IMPDH gene.

IMPDH is a highly conserved protein, and degenerate IMPDH primers were designed on the basis of conserved domains of the protein. The IMPDH primers that were used for amplification of MPA cluster specific probes are shown in Table 1 below:

TABLE 1: Degenerate IMPDH gene primers.

Name             Sequence (degeneracy)                          SEQ ID NO:
IMP_FW^(a)       G G L T Y N D [F]                              17
IMP_FW^(b, c)    GGI GGI YTI ACI TAY AAY GAY TT (16)            18
IMP_RV^(a)       G N V V T R E Q A [A]                          19
IMP_RV^(b, c)    GC IGC YTG YTC ICK IGT IAC IAC RTT ICC (16)    20

^(a) Amino acid sequence.
^(b) Letters in bold indicate degenerate nucleotides using the standard letter code.
^(c) Inosine was used as a non-degenerate nucleotide analogue in order to reduce the redundancy.
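The degeneracy values given in Table 1 can be reproduced with a short sketch that multiplies the number of bases represented by each position of a primer. The standard IUPAC ambiguity codes are used, and inosine (I) is counted as a single, non-degenerate position in line with footnote c. The function name `primer_degeneracy` is hypothetical.

```python
# Number of bases represented by each nucleotide code (standard IUPAC codes).
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
    "I": 1,  # inosine, treated as non-degenerate per footnote c of Table 1
}


def primer_degeneracy(primer: str) -> int:
    """Return the number of distinct sequences a degenerate primer represents."""
    total = 1
    for base in primer.upper():
        if base.isspace():
            continue  # spaces only separate codons in the printed sequence
        total *= IUPAC_DEGENERACY[base]
    return total


# Primer sequences as printed in Table 1.
IMP_FW = "GGI GGI YTI ACI TAY AAY GAY TT"
IMP_RV = "GC IGC YTG YTC ICK IGT IAC IAC RTT ICC"
fw_degeneracy = primer_degeneracy(IMP_FW)
rv_degeneracy = primer_degeneracy(IMP_RV)
```

Each primer contains four two-fold degenerate positions, so both come out at 16, matching the table. Had N been used in place of inosine, each such position would contribute a further factor of four.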

A 1115 bp amplification product was obtained with the IMP_FW/RV primers. This fragment was used as a probe to screen the PBBAC library.

As the coverage of the PBBAC library was about 10 fold, a single copy of the IMPDH gene should yield approximately 10 hybridization signals, and two copies of the IMPDH gene would result in approximately 20 hybridization signals. Extensive experiments indeed indicated the existence of two IMPDH genes in P. brevicompactum genome, as 24 hybridization signals were found. This observation strongly indicated that P. brevicompactum obtained resistance against MPA by having an extra copy of the IMPDH gene. This mechanism (overexpression of IMPDH) is similar to the MPA resistance mechanism observed in Candida albicans.
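The arithmetic behind this reasoning can be sketched as follows. The sketch assumes that library clones are distributed evenly over the genome and that each clone carrying a gene copy gives exactly one hybridization signal; both are simplifying assumptions, and the function names are hypothetical.

```python
def expected_signals(coverage: float, gene_copies: int) -> float:
    """Expected number of hybridizing clones for a gene present in gene_copies copies."""
    return coverage * gene_copies


def estimate_copy_number(observed_signals: int, coverage: float) -> int:
    """Nearest whole copy number consistent with the observed signal count."""
    return round(observed_signals / coverage)


LIBRARY_COVERAGE = 10   # approximately 10 fold, as stated for the PBBAC library
OBSERVED_SIGNALS = 24   # hybridization signals reported above

one_copy = expected_signals(LIBRARY_COVERAGE, 1)
two_copies = expected_signals(LIBRARY_COVERAGE, 2)
estimated_copies = estimate_copy_number(OBSERVED_SIGNALS, LIBRARY_COVERAGE)
```

With 24 observed signals and roughly 10 fold coverage, the ratio is 2.4, which rounds to two copies, consistent with the conclusion drawn in the text.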

The following five IMPDH BACs were selected for further analyses: 1-B12, 1-E13, 1-C23, 1-B16, 1-H11, and 1-13. Depending on the hybridization pattern, these clones could be subdivided into the following groups:

-   a) 1-B12, 1-B16, 1-H11, and
-   b) 1-E13 and 1-C23.

Sequence and BLAST analysis revealed that the neighbouring sequence in group a) was a ras GTPase activating protein. In connection with the present invention, the inventors used BLAST searches of available fungal genomes to establish the number of IMPDH genes. They found that only one IMPDH gene was present in these organisms. Further BLAST searches revealed that IMPDH was located in close proximity to a ras GTPase activator protein in Neurospora crassa, Magnaporthe grisea, A. oryzae, and A. terreus. This result indicated that the group a) BACs encoded the “standard” IMPDH gene in P. brevicompactum.

It was thus hypothesized that the group b) clones would encode the extra IMPDH copy that would hopefully be located in the MPA gene cluster or close to it. However, initial sequence analysis of the group b) clones did not succeed, probably due to the large size of the clones.

Example 3 Sequencing of the MPA Gene Cluster

The sequencing of the BAC clones suspected to contain the MPA gene cluster was outsourced to MWG Biotech. The company constructed a shotgun library of the BAC with an average insert size of approximately 2-3 kb, followed by random picking of a number of clones for end-sequencing. The size of the BAC insert was estimated to be approximately 100 kb.

The sequence returned from MWG Biotech was assembled into four large contigs. These were separated by gaps that were later closed by sequencing. The annotation of BAC 1-C23 showed that 5 ORFs (designated mpaD to mpaH in FIG. 1) had similarity to polyketide biosynthesis genes. However, no putative PKS genes could be identified, as the gene cluster was located very near the end of the insert. Chances were thus that the remaining part of the MPA gene cluster could be found in another BAC. “BAC walking” subsequently indeed allowed identification of the remaining part of the MPA gene cluster.

FIG. 1 shows a schematic representation of the MPA gene cluster.

Example 4 Analysis of the Genes in the MPA Gene Cluster

Several of the MPA genes shown in FIG. 1 (mpaA-mpaH) have amino acid sequence homology with proteins previously shown to be involved in polyketide biosynthesis. For example, the fragment of mpaD that was present on BAC 1-C23 was 36% identical in 192 amino acids to a cytochrome P450 involved in pisatin demethylation in Nectria haematococca. MpaE was 32% identical in 84 aa to AhlD, which is a zinc dependent hydrolase in Arthrobacter sp. MpaF was 62% identical in 524 aa to IMPDH from Candida dubliniensis, and MpaG was 30% identical in 374 aa to an O-methyltransferase B of Hypocrea virens. MpaH has weak similarity to an α/β-hydrolase fold 1 protein family.

TABLE 2 Analysis of genes in the MPA biosynthesis gene cluster

SEQ ID  Enzyme  Putative        Size   Predicted domains¹ and        Closest characterized homologue
NO:             activity        [aa]   features                      Protein                   Organism        Similarity

1       MpaA    Prenyl          316    7 transmembrane helices²      4-hydroxybenzoate         Aspergillus     44% in
                transferase            Pfam: UbiA prenyltransferase  octaprenyltransferase     fumigatus       308 aa
                                       family                        (XP_746965.1)

2       MpaB    Unknown         423    Type III reverse signal       Put. dephospho-CoA        Synechococcus   30% in
                function               membrane anchor³              kinase (ZP_01083610)      sp.             182 aa
                                       Pfam: None

3       MpaC    Polyketide      2487   Pfam⁶: KS, AT, PP, MT,        Citrinin PKS              Monascus        32% in
                synthase               Esterase⁴                     (dbj|BAD44749.1)          purpureus       2125 aa

4       MpaD    P450            535    Possible membrane anchor²     Pisatin demethylase       Nectria         30% in
                monooxygenase          Pfam: Cytochrome P450         (P450) (gb|AAC01762.1|)   haematococca    555 aa

5       MpaE    Zn dependent    261    Pfam: Metallo-beta-lactamase  AhlD (Zn dep. hydrolase)  Arthrobacter    32% in
                hydrolase              superfamily II                (gb|AAP57766.1|)          sp.             84 aa

6       MpaF    Inosine         527    Pfam: IMPDH                   IMPDH (gb|AAW65380.1|)    Candida         62% in
                monophosphate                                                                  dubliniensis    524 aa
                dehydrogenase

7       MpaG    O-methyl-       398    Pfam: O-MT: SAM-binding       O-methyl transferase B    Hypocrea        30% in
                transferase            motif and catalytic residues  (gb|ABE60721.1|)          virens          374 aa

8       MpaH    Hydrolase       433    Pfam: M-factor                Akt2 (AK-toxin synthesis) Alternaria      20% in
                                       Weak similarity to α/β-       (dbj|BAA36589.1|)         alternata       255 aa
                                       hydrolase fold 1

¹By similarity to domains in the Pfam database.
²Predicted using Phobius; accessible on the world wide web at phobius.cgb.ki.se.
³Predicted using SignalP3.0; accessible on the world wide web at www.cbs.dtu.dk.
⁴Predicted using CDD at NCBI; accessible on the world wide web at www.ncbi.nlm.nih.gov.
⁵The GENSCAN program only predicts one intron resulting in a 548 amino acid protein. The NetAspGene 1.0 prediction server (accessible on the world wide web at www.cbs.dtu.dk) predicts two introns, which results in a 527 amino acid protein that yields an improved blastp result.
⁶KS: Ketoacylsynthase, AT: acyltransferase, PP: phosphopantetheine attachment site, MT: methyltransferase.

The closest characterized homologues in Table 2 were identified by a blastx search of public sequences with the P. brevicompactum DNA sequences. The column “closest characterized homologue” lists the functionally characterized proteins with the highest similarities to the MPA biosynthesis genes. Although there were putative genes from Aspergillus spp. with higher similarities to the query sequences than the characterized homologues listed in Table 2, these were not included in the table as they do not add any information as to the function of the MPA biosynthesis genes.

As seen from Table 2, eight putative genes were identified of which only one (mpaB) encoded an enzyme with a completely unknown function. In the following examples, all the enzymes will be analyzed and discussed in detail with respect to their catalytic function in the MPA biosynthesis.

Example 5 mpaA (Encodes a Putative Prenyl Transferase; SEQ ID NO: 1)

mpaA (SEQ ID NO: 9) encodes a putative polypeptide (SEQ ID NO: 1) that contains a conserved domain that most likely belongs to the UbiA prenyltransferase family. ubiA encodes a 4-hydroxybenzoate oligoprenyltransferase in E. coli, which is a key enzyme in the biosynthetic pathway to ubiquinone. It has been shown to catalyze the prenylation of 4-hydroxybenzoic acid in position 3, which is similar in mechanism to the prenylation of 5,7-dihydroxy-4-methylphthalide in the MPA biosynthesis (FIG. 7).

The enzymatic activity of MpaA is required at step 6 in MPA biosynthesis for the transfer of farnesyl to the dihydroxyphthalide (FIG. 7). Proteins in the UbiA family contain seven transmembrane segments, and the most conserved region is located on the external side in a loop between the second and third of these segments. Thus, if MpaA is a UbiA-family protein, it should be bound to a membrane and have its active site on the correct loop on the external side of the membrane. An analysis of MpaA using the transmembrane domain predictor Phobius resulted in the pattern illustrated in FIG. 5.

The result in FIG. 5 strongly indicates that 7 transmembrane helices are present in MpaA, as expected for a UbiA family protein. The active site was identified by searching for the active site consensus pattern characteristic for the UbiA prenyltransferase family. The UbiA prenyltransferase family has previously been characterized in Brauer et al. (Journal of Molecular Modelling 10[5-6], 317-327. 2004).

The amino acid active site consensus pattern for a UbiA family protein is given by:

(SEQ ID NO: 36) N-x(3)-[DEH]-x(2)-[LIMFYT]-D-x(2)-[VM]-x-R-[ST]-x(2)-R-x(4)-[GYNKR].

Identified motif in MpaA (residue 91 to 113 of SEQ ID NO: 1): N-dlv-D-rd-I-D-ar-V-a-R-T-km-R-plas-G.

The following conventions apply to the active site consensus pattern:

-   Capital letter: The only amino acid allowed in a given position.
-   Capital letters in [ ]: Allowed amino acids in a given position.
-   x(#): Number of residues where all amino acids are allowed.
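The pattern conventions listed above map directly onto a regular expression, which makes it possible to check the identified MpaA motif against the consensus. The following is a minimal sketch under that assumption; the function `prosite_to_regex` is a hypothetical helper written for this illustration and is not a tool used in the original analysis. It also handles the { } notation for disallowed residues.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a consensus pattern such as N-x(3)-[DEH] into a regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split an element into its core and an optional repeat count, e.g. x(3).
        match = re.fullmatch(r"(.+?)(?:\((\d+)\))?", element)
        core, count = match.group(1), match.group(2)
        if core == "x":
            regex = "."                       # any amino acid
        elif core.startswith("{") and core.endswith("}"):
            regex = "[^" + core[1:-1] + "]"   # residues that are not allowed
        else:
            regex = core                      # single letter or [allowed set]
        if count:
            regex += "{" + count + "}"
        parts.append(regex)
    return "".join(parts)


UBIA_PATTERN = "N-x(3)-[DEH]-x(2)-[LIMFYT]-D-x(2)-[VM]-x-R-[ST]-x(2)-R-x(4)-[GYNKR]"

# Motif identified in MpaA (residues 91 to 113), written as a plain sequence.
MPAA_MOTIF = "N-dlv-D-rd-I-D-ar-V-a-R-T-km-R-plas-G".replace("-", "").upper()

ubia_regex = prosite_to_regex(UBIA_PATTERN)
print(ubia_regex)
print(len(MPAA_MOTIF))                                    # 23
print(re.fullmatch(ubia_regex, MPAA_MOTIF) is not None)   # True
```

The motif length of 23 residues agrees with the stated span from residue 91 to residue 113.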

The identified active site was in accordance with Brauer et al. The active site was correctly positioned between the second and third of the transmembrane segments on a loop on the external side of the membrane.

Based on sequence similarity between different prenyltransferases, Brauer et al. hypothesized that the active site is on the outside of the membrane, linked to the hydrophilic diphosphate of the farnesyl diphosphate, which has its hydrophobic acyl chain buried in the membrane.

To further substantiate the notion that MpaA is a transmembrane protein, an analysis of the myristoylation pattern was carried out as the hydrophobic acyl chains of myristoyl groups have been shown to target proteins to membranes. The myristoylation site consensus pattern is described below:

Myristoylation site consensus pattern: (SEQ ID NO: 37) G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}

The same rules apply here as for the prenylation active site consensus pattern described above. In addition, letters in { } are not allowed in the given position. In the myristoylation site, it is the first G which is being myristoylated.

The analysis revealed three N-myristoylation sites in MpaA, two of which were positioned at residues 85-92, very close to the active site:

TABLE 4 Identified myristoylation sites in MpaA (SEQ ID NO: 1)

Residues   Sequence matching consensus sequence   SEQ ID NO
85-90      GAgnTW                                 38
87-92      GNtwND                                 39
155-160    GLaiGY                                 40
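Because the sites at residues 85-90 and 87-92 overlap, an ordinary pattern search would report only the first of them. The sketch below shows one way to find overlapping hits with a lookahead-based regular expression; it is an illustration only, and the function name and the short sequence fragments (assembled from the entries in Table 4) are assumptions rather than the original analysis.

```python
import re

# Regex form of G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}; the lookahead permits overlapping hits.
MYRISTOYLATION_SITE = re.compile(r"(?=(G[^EDRKHPFYW].{2}[STAGCN][^P]))")


def find_sites(sequence, first_residue=1):
    """Return (start, end, matched_text) for every site, using 1-based residue numbers."""
    hits = []
    for match in MYRISTOYLATION_SITE.finditer(sequence):
        start = first_residue + match.start()
        hits.append((start, start + 5, match.group(1)))
    return hits


# Residues 85-92 of MpaA, assembled from the two overlapping entries in Table 4.
fragment_85_92 = "GAGNTWND"
print(find_sites(fragment_85_92, first_residue=85))
# [(85, 90, 'GAGNTW'), (87, 92, 'GNTWND')]

# Residues 155-160 of MpaA as listed in Table 4.
print(find_sites("GLAIGY", first_residue=155))
# [(155, 160, 'GLAIGY')]
```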

Probably only one of the two overlapping myristoylation sites at residues 85-90 and 87-92 is myristoylated. It may be speculated that the presence of myristoylation sites immediately prior to the prenylation active site (residues 91-113) may function as anchor points of the prenyl transferase to the membrane, thereby ensuring that the active site is localized in direct proximity of the prenyl-chain in the membrane.

The amino acid sequence spanning position 14-301 in SEQ ID NO: 1 shares 46% identity with the corresponding portion of the closest related amino acid sequence present in the database (EAW19988.1). Sequences relating to the present invention are thus at least 50% identical, preferably at least 55% identical, more preferably at least 60% identical, more preferably at least 65% identical, more preferably at least 70% identical, more preferably at least 80% identical, more preferably at least 85% identical, more preferably at least 90% identical, and most preferably at least 95% identical with position 14-301 in SEQ ID NO: 1.
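Identity thresholds of this kind can be checked once two sequences have been aligned. The following is a minimal sketch, assuming a BLAST-like convention in which identity is the number of identical aligned positions divided by the alignment length, gap columns included; other conventions exist. The two short aligned strings are hypothetical toy data, not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the full alignment length ('-' denotes a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * identical / len(aligned_a)


def meets_threshold(identity: float, threshold: float) -> bool:
    """True if the identity reaches the stated minimum, e.g. 'at least 50% identical'."""
    return identity >= threshold


# Hypothetical toy alignment for illustration only.
toy_a = "MKT-AILV"
toy_b = "MKSGAIIV"

identity = percent_identity(toy_a, toy_b)
print(identity)                       # 62.5
print(meets_threshold(identity, 50))  # True
print(meets_threshold(identity, 70))  # False
```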

Example 6 mpaB (Encodes a Polypeptide with Unknown Function; SEQ ID NO: 2)

mpaB (SEQ ID NO: 10) encodes a putative protein of 423 amino acids with an unknown function (SEQ ID NO: 2). The most similar characterized protein is a dephospho-CoA kinase, with 30% similarity (Table 2). A putative signal targeting MpaB to membranes could be detected using SignalP3.0 software. No cleavage signal was predicted with SignalP3.0 software, and the protein is thus presumably not released from but rather anchored to the membrane. It is uncertain whether the targeting is directed towards the cytoplasmic membrane or towards intracellular membrane-enclosed organelles. The program predicts a hydrophilic N-terminal region, followed by a hydrophobic (H-) region that spans the membrane. The stretch of positively charged amino acid residues on the C-terminal side of the H-region indicates that this side is inside and that the N-terminal region of the protein is situated outside. This was confirmed by a prediction using the Phobius software.

Position 22-422 in SEQ ID NO: 2 shares 61% identity with the corresponding portion of the closest related amino acid sequence available in the database (EAW07745.1). Sequences according to the present invention thus share at least 70% identity, preferably at least 75% identity, preferably at least 80% identity, preferably at least 85% identity, more preferably at least 90% identity, and most preferably at least 95% identity with position 22-422 in SEQ ID NO: 2.

Example 7 mpaC (Encodes a Putative Polyketide Synthase (PKS); SEQ ID NO: 3)

mpaC (SEQ ID NO: 11) encodes a novel putative multifunctional type I PKS (SEQ ID NO: 3) with a GENSCAN-predicted size of 2487 aa (265 kDa). Four putative introns were identified, ranging from 62 to 259 nucleotides. The enzyme shows strong similarity to other PKSs and shares 32% similarity in 2125 aa with the citrinin PKS from Monascus purpureus, which is the characterized PKS with the highest similarity to MpaC (SEQ ID NO: 3). Two putative PKSs from A. nidulans and A. terreus share 45% similarity with MpaC in 2509 and 2375 aa, respectively. Several motifs could be detected by analyzing the amino acid sequence using the Conserved Domain Database (CDD) at the National Center for Biotechnology Information (NCBI) (FIG. 2).

All the domains necessary for a functional PKS were detected with the CDD analysis, namely the KS, AT, and PP domains (FIG. 2). As MPA is an unreduced polyketide it was consistent with the expectations that no reducing domains were identified in the CDD analysis. In addition, an MT domain was identified also in accordance with the biosynthesis which includes a methylation at the tetraketide stage. The MT domain was similar in primary structure to other MT domains identified from other PKSs like the lovastatin PKSs (LNKS, LDKS) from A. terreus and compactin PKSs (MlcA, MlcB) from Penicillium solitum (Table 3).

TABLE 3 MT domains from different fungal PKSs

PKS                   Uniprot ID       Species             MT Residues
MlcA                  dbj|BAC20564.1   P. solitum          1395 . . . 1597
MlcB                  dbj|BAC20566.1   P. solitum          1461 . . . 1590
LNKS                  sp|Q9Y8A5        A. terreus          1417 . . . 1553
LDKS                  gb|AAD34559.1    A. terreus          1431 . . . 1557
MpaC (SEQ ID NO: 3)   Not assigned     P. brevicompactum   1923 . . . 2075

The residues of MlcA, MlcB, LNKS, and LDKS belonging to the MT domains were given in the Uniprot database, and for MpaC the residues were identified in the CDD analysis in FIG. 2. In order to confirm the CDD result concerning the MT domain of MpaC, the MT domains listed in Table 3 were aligned, and the result is presented in FIG. 3.

In FIG. 3, three motifs designated Motif I to III, which are known to be present in most PKS MT domains, could also be identified in MpaC. The biosynthesis of MPA has been shown to involve a methylation of the tetraketide with S-adenosyl methionine as methyl donor. Hence, the presence of the MT domain is consistent with this finding, just as the lack of reducing domains is consistent with the fact that MPA is an unreduced polyketide.

The esterase in MpaC is not homologous to any characterized thioesterases. The domain contains the α/β-hydrolase fold and is most similar to carboxylic acid esterases, which by the addition of water cleave the carboxylic acid ester into the acid and an alcohol. The domain has similarity to the Aes protein from E. coli, which has been shown to hydrolyze p-nitrophenyl acetate into acetate and p-nitrophenol. Only the cleavage of the thioester between the tetraketide and the PKS requires a similar catalytic activity, and it is therefore likely that the esterase domain is involved in this step.

The amino acid sequence spanning position 10-2487 of SEQ ID NO: 3 shares 49% identity with the corresponding portion of the closest related amino acid sequence present in the database (EAA67005.1). Sequences relating to the present invention are thus at least 50% identical, preferably at least 55% identical, more preferably at least 60% identical, more preferably at least 65% identical, more preferably at least 70% identical, more preferably at least 80% identical, more preferably at least 85% identical, more preferably at least 90% identical, and most preferably at least 95% identical with position 10-2487 in SEQ ID NO: 3.

Example 8 mpaD (Encodes a Putative P450 Monooxygenase; SEQ ID NO: 4)

mpaD (SEQ ID NO: 12) contains three introns and encodes a putative P450 monooxygenase (CDD and Pfam) of 535 amino acids (SEQ ID NO: 4). The protein contains a 10 amino acid long N-terminal H-region, which may function as a membrane anchor. SignalP3.0 predicts MpaD to be a signal protein with a cleavage site after residue 25 (Signal probability=0.61; Anchor probability=0.35; data not shown). However, the protein is probably not secreted, as the most likely putative function of the protein is oxidation of an MPA intermediate at step 5 in FIG. 7.

The amino acid sequence spanning position 24-502 of SEQ ID NO: 4 shares 54% identity with the corresponding portion of the closest related amino acid sequence present in the database (BAE65443.1). Sequences relating to the present invention are thus at least 60% identical, more preferably at least 65% identical, more preferably at least 70% identical, more preferably at least 80% identical, more preferably at least 85% identical, more preferably at least 90% identical, and most preferably at least 95% identical with position 24-502 in SEQ ID NO: 4.

Example 9 mpaE (Encodes a Putative Hydrolase; SEQ ID NO: 5)

mpaE (SEQ ID NO: 13) encodes a putative hydrolase (COG1237: Metal dependent hydrolases of beta-lactamase superfamily II; Pfam: Metallo-beta-lactamase superfamily) of 261 amino acids (SEQ ID NO: 5). It is uncertain how many introns, if any, the gene contains, as the predicted protein is based solely on the blastx result and there is no obvious start codon based on the similarity to other proteins. Consequently, it is also impossible to predict whether or not this protein contains any signals targeting it to a specific cellular structure, as these usually are localized in the N-terminal end of the protein. It is difficult to assign the putative function of MpaE, as several proteins contain the lactamase domain, but none with a function that is obvious in the MPA biosynthesis. Certain thioesterases and glyoxylases contain the metallo-beta-lactamase domain, and therefore it is possible that MpaE functions as a thioesterase that cleaves the thioester linking the polyketide chain to the PKS.

The amino acid sequence spanning position 1-255 of SEQ ID NO: 5 shares 49% identity with the corresponding portion of the closest related amino acid sequence present in the database (EAT86512). Sequences relating to the present invention are thus at least 50% identical, preferably at least 55% identical, more preferably at least 60% identical, more preferably at least 65% identical, more preferably at least 70% identical, more preferably at least 80% identical, more preferably at least 85% identical, more preferably at least 90% identical, and most preferably at least 95% identical with position 1-255 in SEQ ID NO: 5.

Example 10 mpaF (Encodes a Putative IMPDH; SEQ ID NO: 6)

mpaF (SEQ ID NO: 14) encodes a putative IMPDH protein (SEQ ID NO: 6).

The amino acid sequence spanning position 3-526 of SEQ ID NO: 6 shares 81% identity with the corresponding portion of the closest related amino acid sequence present in the database (BAE62832.1). Sequences relating to the present invention are thus at least 85% identical, more preferably at least 90% identical, and most preferably at least 95% identical with position 3-526 in SEQ ID NO: 6.

Example 11 mpaG (Encodes a Putative O-Methyltransferase; SEQ ID NO: 7)

mpaG (SEQ ID NO: 15) contains one intron (GENSCAN; NetAspGene 1.0) and encodes a putative protein of 398 residues (43.1 kDa) (SEQ ID NO: 7). The protein is 30% identical in 374 aa to viridin O-methyltransferase from Hypocrea virens and 45% identical in 403 aa to a hypothetical protein from Gibberella zeae. The predicted domain belongs to a group of O-methyltransferases that utilize SAM as methyl donor. The structure of the related enzyme caffeic acid-O-methyltransferase (C-O-MT) has been determined; this enzyme catalyzes the methylation of the following lignin monomers in plants: caffeate, caffeoyl alcohol, caffeoyl aldehyde, 5-hydroxyferulate, 5-hydroxyconiferyl alcohol and 5-hydroxyconiferyl aldehyde. By comparing the deduced MpaG amino acid sequence to O-MT proteins with similar functions, it was possible to estimate whether the required domains are present in MpaG. The selected sequences for this purpose are listed in Table 5.

TABLE 5 O-MT proteins used for alignment with MpaG

Protein                  Uniprot ID        Species
MpaG (SEQ ID NO: 7)      Not assigned      P. brevicompactum
O-MT B                   gb|ABE60721.1|    Hypocrea virens
O-MT B                   gb|AAS66016.1|    A. parasiticus
Hyp. O-MT¹               gb|EAA69894.1|    Gibberella zeae
Caffeoyl-O-MT (C-O-MT)   gb|AAB46623.1|    Medicago sativa

¹Hyp. O-MT: hypothetical O-MT; identified in the annotation of the MPA gene cluster, where it was the blastx hit with the highest score to mpaG.

In the alignment of the sequences from Table 5 the first 90 residues were omitted as seen in FIG. 6.

The O-MT B protein of H. virens seems to be involved in antibiotic production and the O-MT B from A. parasiticus is involved in aflatoxin production.

The proteins have locally conserved domains such as the SAM binding site and certain catalytic residues. However, apart from those conserved domains, the proteins are very diverse, which is consistent with the fact that the substrates of the enzymes are structurally very different.

The amino acid sequence spanning position 5-397 of SEQ ID NO: 7 shares 45% identity with the corresponding portion of the closest related amino acid sequence present in the database (XP_382791.1). Sequences relating to the present invention are thus at least 50% identical, preferably at least 55% identical, more preferably at least 60% identical, more preferably at least 65% identical, more preferably at least 70% identical, more preferably at least 80% identical, more preferably at least 85% identical, more preferably at least 90% identical, and most preferably at least 95% identical with position 5-397 in SEQ ID NO: 7.

Example 12 mpaH (Encodes a Putative Hydrolase; SEQ ID NO: 8)

mpaH (SEQ ID NO: 16) contains at least two introns, as predicted with NetAspGene 1.0 and blastx similarity searches, and encodes a putative protein of 433 amino acids (SEQ ID NO: 8). The protein is 20% identical in 255 aa to Akt2 and has a weak similarity to an M-factor domain (Pfam analysis: E-value=0.12) and a hydrolase 1 domain (Pfam analysis: E-value=0.9). MpaH is 35% similar in 448 amino acids to a hypothetical protein from A. fumigatus, which is a putative toxin biosynthesis protein due to its similarity to Akt2. Akt2 has an unknown function in the biosynthesis of the AK-toxin 2, produced by a Japanese pear specific variant of Alternaria alternata. These proteins contain a hydrolase domain with unknown substrate specificity. Thus, the most likely catalytic function in the MPA synthesis is hydrolysis of the farnesyl side chain at step 7, yielding demethylmycophenolic acid.

The amino acid sequence spanning position 1-420 of SEQ ID NO: 8 shares 69% identity with the corresponding portion of the closest related amino acid sequence present in the database (CAK48380.1). Sequences relating to the present invention are thus at least 75% identical, more preferably at least 80% identical, more preferably at least 85% identical, more preferably at least 90% identical, and most preferably at least 95% identical with position 1-420 in SEQ ID NO: 8.

Example 13 MPA Biosynthesis in P. brevicompactum in Relation to the MPA Gene Cluster

In the MPA biosynthesis a tetraketide backbone aromatic ring and a farnesyl group are fused, but only the genes necessary for the polyketide structure and postmodifications are found within the identified gene cluster. The farnesyl diphosphate is produced by the normal mevalonate pathway in the fungus. The MPA biosynthesis, with the enzymes identified in this study assigned to each reaction step, is presented in FIG. 7.

The formation of the tetraketide product of step 1 in FIG. 7 is catalyzed by MpaC, which belongs to a group classified as “fungal non-reducing methylating PKS”. The methylation of C-4 at step 3 in FIG. 7 occurs after the tetraketide has been synthesized, as the two neighbouring carbonyl groups at C-3 and C-5 activate the central methylene, thereby rendering it more reactive for methylation. MpaC contains only one PP domain and may or may not contain a cyclase domain. The predicted esterase domain at the N-terminal end of the protein may catalyze the cyclization, aromatization and release of the polyketide from the PKS. Thioesterases, which belong to the same family of proteins, have previously been reported to be involved in chain-length determination, cyclization and lactonization (Fujii et al., 2001a). However, the esterase in MpaC is not homologous to any characterized thioesterases but may well belong to a new group of fungal cyclization domains. Thus, it is listed at step 4 in FIG. 7 that the esterase domain of MpaC catalyzes the cyclization, aromatization and cleavage of the thioester linkage between the polyketide and the PKS. As one may notice from FIG. 7, 5-methylorsellinic acid, which is the first stable intermediate from the MPA biosynthesis, does not contain the lactone group. Hence, the PKS does not catalyze the lactonization but only the cyclization, the subsequent enolization and the release of the polyketide from the PKS at step 4.

For lactonization to occur at step 5, the C-3-methyl group must be oxidized to the alcohol, which is a reaction often catalyzed by P450 monooxygenases. In the gene cluster, only MpaD has similarity to a P450 monooxygenase. It has been reported that 3,5-dihydroxyphthalic acid is produced by P. brevicompactum, and this compound is probably derived from orsellinic acid: oxidation of the methyl group of orsellinic acid to a carboxylic acid yields 3,5-dihydroxyphthalic acid. As the oxidations of the C-3 methyl group of MPA and orsellinic acid are mechanistically very similar, MpaD is likely to catalyze both reactions. MpaD has a possible membrane anchor domain linking the reaction to an intracellular organelle. This corresponds well to the fact that the prenyltransferase, MpaA, which catalyzes the subsequent reaction (step 6), is membrane bound with seven transmembrane hydrophobic regions. The P450 converts the 5-methylorsellinic acid to the phthalide in close proximity to the prenyltransferase, which then adds the farnesyl side chain to the aromatic ring. It is hypothesized that a myristoylation site in close proximity to the active site of MpaA, when myristoylated, functions as an anchor point of the protein to the membrane. In this way, the active site is maintained close to the farnesyl pyrophosphate, which is buried in the membrane.

The step following prenylation in the MPA biosynthesis is an oxidation of either the terminal or central double bond of the farnesyl chain (step 7). The mechanism has been reported to include an epoxidation of the double bond, followed by hydrolysis. The hydrolysis may be catalyzed by MpaE or MpaH, which both have similarities to hydrolases. MpaE, however, has similarity to a Metallo-β-lactamase, AhlD, which is involved in the degradation of the lactone of N-acyl homoserine lactone. Thus, MpaE is not thought to be involved in the hydrolysis of the farnesyl-chain. MpaH, on the other hand, has certain similarity to a Pfam category, α/β-hydrolase fold 1, which includes the enzyme class of epoxide hydrolases. Hence, MpaH is more likely to hydrolyze the epoxide intermediate than MpaE. As the prenylation of the phthalide occurs in the microsomal membranes, one may speculate that the hydrolysis of the farnesyl-chain also takes place in a microsomal membrane. The enzyme MpaB (Table 2) contains a putative membrane anchor and could thus also be involved in the farnesyl double bond oxidation. However, no putative hydrolytic or oxidative domains were detected by conserved domain analyses, which is the reason why this function is not assigned to MpaB.

The final step in the MPA biosynthesis is methylation of the 5-hydroxyl group, which is catalyzed by MpaG, the only O-methyltransferase in the MPA biosynthesis gene cluster (Table 2).

When describing gene clusters responsible for the production of secondary metabolites, it is always worthwhile investigating the factors that potentially initiate the production, as is the case for example for MlcR in the compactin gene cluster. However, no such transcription factors could be identified within the MPA biosynthesis gene cluster, and so the regulation must be further elucidated by correlating the transcription profiles on different media and under different conditions with the MPA production. Such studies of the MPA production have already demonstrated that MPA is produced during growth and not only during the stationary phase, where most other secondary metabolites are produced. Thus, the question is whether there are any conditions under which the strain does not produce MPA and whether any regulation of the MPA biosynthesis genes in P. brevicompactum exists.

In the MPA gene cluster it is only MpaB (SEQ ID NO: 2), MpaE (SEQ ID NO: 5) and MpaH (SEQ ID NO: 8) which cannot be assigned a specific role in the biosynthesis or resistance mechanism. However, most likely these enzymes are involved in the oxidation of the farnesyl chain or in an unresolved part of the resistance mechanism.

Example 14 The P. brevicompactum MPA Resistance Mechanism

P. brevicompactum produces MPA in order to achieve a competitive advantage over other organisms, which are inhibited by MPA. Hence, P. brevicompactum needs to overcome the inhibitory effect of MPA. MPA inhibits the IMPDH-catalyzed conversion of IMP to XMP. In this reaction, IMP binding precedes that of nicotinamide adenine dinucleotide (NAD), and reduced nicotinamide adenine dinucleotide (NADH) is released prior to XMP. MPA binds to IMPDH after NADH is released but before XMP is produced and thus functions as an uncompetitive inhibitor.
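The consequence of uncompetitive inhibition can be illustrated with the standard textbook rate law, v = Vmax·[S] / (Km + [S]·(1 + [I]/Ki)), in which the inhibitor binds only to the enzyme-substrate complex. The following is a minimal sketch of that general rate law; all parameter values are hypothetical and chosen for illustration, and they are not measured constants for IMPDH or MPA.

```python
def uncompetitive_rate(s: float, i: float, vmax: float, km: float, ki: float) -> float:
    """Reaction rate under uncompetitive inhibition (standard textbook rate law)."""
    alpha = 1.0 + i / ki  # factor by which both apparent Vmax and apparent Km are reduced
    return vmax * s / (km + alpha * s)


# Hypothetical parameters in arbitrary units (illustration only).
VMAX, KM, KI = 1.0, 10.0, 1.0

# Without inhibitor the expression reduces to the Michaelis-Menten equation.
print(uncompetitive_rate(s=10.0, i=0.0, vmax=VMAX, km=KM, ki=KI))  # 0.5

# With inhibitor present, even a very high substrate concentration
# cannot restore the uninhibited maximum rate.
print(round(uncompetitive_rate(s=1e6, i=1.0, vmax=VMAX, km=KM, ki=KI), 3))  # 0.5
```

Because this type of inhibition is not relieved by raising the substrate concentration, increasing the amount of enzyme, for example through an additional IMPDH gene copy, is one plausible way for a cell to maintain flux.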

The presence of this mechanism means that, according to a preferred embodiment of the present invention, an additional IMPDH gene is present in the host cell, unless the host strain genome harbours several IMPDH copies and/or encodes IMPDH copies that are fully or partly MPA-resistant. IMPDH “redundancy” thus allows the host cell to grow despite the presence of the MPA which is produced in the host cell culture.

Example 15 Heterologous MPA Production

One or more expression vectors encoding one or more of the MPA synthesis enzymes from P. brevicompactum is/are inserted into a host cell. If the host cell is fully or partly MPA resistant, then it may be optional to insert IMPDH encoding sequences in the host cell. The host cell is preferably a fungal organism which is relatively easy to cultivate, such as yeast. The host cell could in principle be any cell, including a bacterial cell, a mammalian cell or a plant cell. However, in order to ensure correct post translational modification, which may be vital for enzyme function, the invention works most efficiently in eukaryotic, preferably fungal, organisms. For practical reasons, yeast is a preferred host cell since it is generally easy to cultivate on an industrial scale.

The host cell is inoculated into a suitable growth medium that may be liquid, semi-liquid or solid and incubated under suitable conditions such that MPA production takes place. After an appropriate incubation period, the MPA containing medium is harvested from the cell culture.

Example 16 Recovery of MPA

MPA is usually recovered from growth media by organic extraction followed by distillation and crystallization techniques.

Example 17 Improved MPA Yield in P. brevicompactum

The present invention can also be used to improve MPA yield in Penicillium spp. producing MPA naturally. In one embodiment, one or more regulatory sequences could be altered to obtain a stronger expression of one or more MPA enzymes. In another embodiment, MPA production is increased by addition of additional MPA gene copies. In a third embodiment, it is envisaged that one or more of the natural MPA gene cluster promoters are stimulated to increase MPA biosynthesis and/or to obtain a constitutive MPA synthesis. In a fourth embodiment, the present invention can be carried out using a fungal strain that contains increased amounts of the precursor(-s) “farnesyl diphosphate” and/or acetyl CoA. The invention may also be carried out by a mixture of these embodiments.

The advantages of using P. brevicompactum (or another fungus that naturally produces MPA) as a host cell for improved MPA yield are obvious:

i) It is hypothesized that the enzymes are subject to correct post-translational modification, thus ensuring synthesis of functional enzymes;
ii) It is more than likely that organisms with the capability of producing MPA harbour several unidentified mechanisms aiding the fungus in MPA resistance, thus obtaining relatively stable and reliable growth despite high MPA concentrations;
iii) Improved yield of MPA can be obtained with only minor alterations of existing MPA production facilities and production procedures.

The fungus may be used in the form of a spore suspension or in mycelial form. The solid substrate matrix is e.g. selected from wheat bran, rice bran, ragi flour, soya flour, cotton seed flour, wheat flour, rice flour, rice husk, or any mixture thereof. Preferred incubation conditions are moist and aerobic conditions at 20-35° C. (preferably 25-30° C.) for 1-30 days (preferably 1-2 weeks). Any method for culturing P. brevicompactum can be employed. Well known methods are described e.g. in U.S. Pat. No. 4,452,891.

MPA can subsequently be recovered by conventional procedures.

In the following examples (18-21), the construction of mpaC deletion mutants is described.

Example 18 Construction of Gene Targeting Substrates

One way to determine whether mpaC, a putative PKS, is responsible for the biosynthesis of MPA is to delete the gene from the genome and record the consequence for MPA productivity. Hence, we constructed several mpaC deletion mutants, all of which showed much reduced MPA productivity. To construct the mpaC deletion strain, the bipartite gene targeting method was used, with the hygromycin resistance gene (hph) as a selectable marker. Each fragment of the bipartite substrate consists of a targeting fragment and a marker fragment. In order to enhance the homologous recombination efficiency, approximately 2.7 kb of both upstream and downstream flanking regions of mpaC were used. The upstream (2.65 kb) and downstream (2.67 kb) sequences flanking mpaC were amplified from genomic DNA of P. brevicompactum IBT23078 using primer pairs KO-mpaC-UF (SEQ ID NO: 21)/KO-mpaC-URa (SEQ ID NO: 22) and KO-mpaC-DFa (SEQ ID NO: 23)/KO-mpaC-DR (SEQ ID NO: 24), respectively. The two fragments containing the hygromycin B resistance cassette (HygB) were amplified from pAN7-1, a vector carrying the HygB cassette. The upstream 2/3 HygB cassette (1.72 kb) was amplified using primers Upst-HygF-b (SEQ ID NO: 25) and Upst-HygR-N (SEQ ID NO: 26), whereas the downstream 2/3 HygB cassette (1.64 kb) was amplified using primers Dwst-HygF-N (SEQ ID NO: 27) and Dwst-HygR-A (SEQ ID NO: 28). A schematic overview of the gene targeting method is illustrated in FIG. 8.

To obtain the first fragment of the bipartite substrate, the upstream mpaC and upstream 2/3 HygB fragments were fused together by PCR using primers KO-mpaC-UF (SEQ ID NO: 21) and Upst-HygR-N (SEQ ID NO: 26). Similarly, the second fragment of the bipartite substrate was generated by fusing the downstream 2/3 HygB and downstream mpaC fragments together using primers Dwst-HygF-N (SEQ ID NO: 27) and KO-mpaC-DR (SEQ ID NO: 24).
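The approximate sizes of the two fused fragments follow from the fragment lengths stated above. The following is a minimal sketch of that arithmetic. It assumes that each fusion joins two PCR products through a single 20-nt overlap, corresponding to the 20-nt lowercase primer tails listed in Table 6; the fused sizes themselves are not stated in the text, and the function and variable names are illustrative only.

```python
# Approximate fragment lengths in base pairs, taken from Example 18.
UPSTREAM_MPAC_BP = 2650      # upstream flank of mpaC (2.65 kb)
DOWNSTREAM_MPAC_BP = 2670    # downstream flank of mpaC (2.67 kb)
UPSTREAM_HYGB_BP = 1720      # upstream 2/3 of the HygB cassette (1.72 kb)
DOWNSTREAM_HYGB_BP = 1640    # downstream 2/3 of the HygB cassette (1.64 kb)

# Assumption: the two PCR products being fused share one 20-nt overlap,
# matching the 20-nt lowercase primer tails listed in Table 6.
OVERLAP_BP = 20


def fused_length(first_bp: int, second_bp: int, overlap_bp: int = OVERLAP_BP) -> int:
    """Length of a fusion PCR product built from two overlapping fragments."""
    return first_bp + second_bp - overlap_bp


first_substrate_bp = fused_length(UPSTREAM_MPAC_BP, UPSTREAM_HYGB_BP)
second_substrate_bp = fused_length(DOWNSTREAM_HYGB_BP, DOWNSTREAM_MPAC_BP)

print(f"First bipartite fragment:  ~{first_substrate_bp / 1000:.2f} kb")
print(f"Second bipartite fragment: ~{second_substrate_bp / 1000:.2f} kb")
```

Because the input lengths are rounded to two decimals of a kilobase, the results are estimates useful for checking band sizes on a gel rather than exact sequence lengths.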

Primers used to generate bipartite PCR fragments and to investigate the targeting pattern are listed in Table 6.

TABLE 6
List of primers used in this work.

No.   Primer name     Sequence                                       SEQ ID NO

Upstream mpaC
1.    KO-mpaC-UF      GAGGTGACCGCTACGTGTGT                           21
2.    KO-mpaC-URa     gatccccgggaattgccatgCGTGCTGCGATACTCATTGC       22

Downstream mpaC
3.    KO-mpaC-DFa     ggactgagtagcctgacatcGGTCGTAAGCCTTGGCTGTG       23
4.    KO-mpaC-DR      CCTACGCGGTTTCCTGAGTT                           24

Hygromycin cassette
H1.   Upst-HygF-b     catggcaattcccggggatcGCTGATTCTGGAGTGACCCAGAG    25
H2.   Upst-HygR-N     CTGCTGCTCCATACAAGCCAACC                        26
H3.   Dwst-HygF-N     GACATTGGGGAATTCAGCGAGAG                        27
H4.   Dwst-HygR-A     gatgtcaggctactcagtccCGTTGTAAAACGACGGCCAGTGC    28

Primers for checking targeting status
5.    KO-mpaC-F1      cagacggcagacaaccgaga                           29
6.    KO-mpaC-Re3     TGGGCTCGTATTTGACTCCG                           30
7.    KO-2mpaC-UF     GGACACACGTAGGCAATGAGT                          31
8.    KO-2mpaC-URa    GGTGGCACCACAAGCTGTAT                           32
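Fusion of a flanking fragment to a marker fragment by PCR relies on complementary tails on the primers at the junction. As an illustration, the short sketch below checks that the lowercase 5' tails of the junction primers in Table 6 are reverse complements of one another. It assumes that the lowercase portion of each primer denotes the added tail, which is an interpretation of the table rather than something stated in the text; the helper names are illustrative.

```python
# Junction primers copied from Table 6 (lowercase = presumed 5' tail).
PRIMERS = {
    "KO-mpaC-URa": "gatccccgggaattgccatgCGTGCTGCGATACTCATTGC",
    "Upst-HygF-b": "catggcaattcccggggatcGCTGATTCTGGAGTGACCCAGAG",
    "KO-mpaC-DFa": "ggactgagtagcctgacatcGGTCGTAAGCCTTGGCTGTG",
    "Dwst-HygR-A": "gatgtcaggctactcagtccCGTTGTAAAACGACGGCCAGTGC",
}

# Primer pairs that meet at the two fusion junctions.
JUNCTION_PAIRS = [
    ("KO-mpaC-URa", "Upst-HygF-b"),   # upstream flank / upstream HygB part
    ("KO-mpaC-DFa", "Dwst-HygR-A"),   # downstream flank / downstream HygB part
]

COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, in lowercase."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.lower()))


def lowercase_tail(primer: str) -> str:
    """Return the leading run of lowercase letters of a primer."""
    tail = []
    for base in primer:
        if not base.islower():
            break
        tail.append(base)
    return "".join(tail)


def tails_are_complementary(name_a: str, name_b: str) -> bool:
    """True if the two primers' tails are reverse complements of each other."""
    tail_a = lowercase_tail(PRIMERS[name_a])
    tail_b = lowercase_tail(PRIMERS[name_b])
    return tail_a == reverse_complement(tail_b)


for first, second in JUNCTION_PAIRS:
    print(first, second, tails_are_complementary(first, second))
```

A check of this kind is a quick way to catch transcription errors in primer tables before ordering oligonucleotides.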

Example 19 Transformation of P. brevicompactum IBT23078

Genetic transformation of P. brevicompactum IBT23078 was carried out according to a slightly modified version of the procedure described by Nielsen, M. L., Albertsen, L., and Mortensen, U. H. (2005) in “Genetic stability of direct and inverted repeats in Aspergillus nidulans”, Journal of Biotechnology 118:S13. 21-hour-old fungal mycelium was used for protoplast preparation. All transformation experiments were performed with 2×10⁵ protoplasts in 200 μl transformation buffer. 1-2 μg of each purified fusion PCR fragment was used for transformation. Selection of transformants was done on selective minimal medium (MM) containing 1 M sorbitol, 2% glucose and 300 μg/ml hygromycin. For the positive control experiment, P. brevicompactum IBT23078 was transformed with the pAN7-1 plasmid carrying the HygB cassette. Several transformants were observed after 4-5 days of incubation at 25° C. Transformants were purified by streaking out spores to obtain single colonies on selective minimal medium containing 150 μg/ml hygromycin and incubated at 25° C. for 4-5 days. The resulting transformants were further purified twice on fresh selective medium. Twenty purified transformants were selected for further investigation.

Example 20 Analysis of Transformants

Each purified transformant was three-point inoculated on Yeast Extract Sucrose (YES) agar (20 g/L yeast extract, 150 g/L sucrose, 0.5 g/L MgSO₄.7H₂O, 0.01 g/L ZnSO₄.7H₂O, 0.005 g/L CuSO₄.5H₂O, 20 g/L agar) and incubated at 25° C. for 5 days. Total genomic DNA from each clone was isolated and the integration pattern of the HygB cassette was investigated by PCR and sequencing. For isolation of genomic DNA, 40-50 mg of mycelium was taken from YES agar and transferred to a 2 ml Eppendorf tube containing steel balls (2×Ø 2 mm, 1×Ø 5 mm). The mycelium was frozen in liquid nitrogen and homogenized in a Mixer Mill for 10 min at 4° C. The resulting powder was used for genomic DNA extraction using FastDNA® Spin Kit for Soil (Qbiogene, Inc.).
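The YES agar composition given above is stated per litre. As a convenience, the following sketch scales those concentrations to an arbitrary batch volume. The concentrations are taken from the text; the function name and the 0.5 L example volume are illustrative assumptions only.

```python
from typing import Dict

# YES agar composition in grams per litre, as stated in Example 20.
YES_AGAR_G_PER_L: Dict[str, float] = {
    "yeast extract": 20.0,
    "sucrose": 150.0,
    "MgSO4.7H2O": 0.5,
    "ZnSO4.7H2O": 0.01,
    "CuSO4.5H2O": 0.005,
    "agar": 20.0,
}


def amounts_for_volume(volume_l: float) -> Dict[str, float]:
    """Return grams of each component needed for the given volume in litres."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: g_per_l * volume_l for name, g_per_l in YES_AGAR_G_PER_L.items()}


# Example: a hypothetical 0.5 L batch.
for component, grams in amounts_for_volume(0.5).items():
    print(f"{component}: {grams:g} g")
```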

In order to investigate the integration events, two PCR experiments were performed. Both PCR experiments were performed by using primer pairs in which one of the primers is located outside the homologous region and the other is located in the HygR cassette. FIGS. 9A and 9B show the results from amplification of the upstream and downstream regions of mpaC from the wild-type and some transformants. Out of 20 transformants, the following 9 transformants were found to be the correct mpaC deletion strains: MPA1-1, MPA1-3, MPA1-8, MPA2-3, MPA2-4, MPA2-5, MPA2-6, MPA2-7 and MPA2-9. The remaining 11 transformants must have appeared due to non-homologous integration. As expected, the wild-type and transformants derived from non-homologous recombination gave no PCR product when checked for integration at the mpaC locus.
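The counts above imply a gene targeting frequency that can be computed directly. The sketch below does so from the strain names and the total of 20 transformants given in the text; the term "targeting frequency" and the variable names are used here for illustration and do not appear in the original.

```python
# Transformants identified by PCR as correct mpaC deletion strains (Example 20).
CORRECT_DELETION_STRAINS = [
    "MPA1-1", "MPA1-3", "MPA1-8",
    "MPA2-3", "MPA2-4", "MPA2-5", "MPA2-6", "MPA2-7", "MPA2-9",
]
TOTAL_TRANSFORMANTS = 20  # purified transformants analysed

homologous = len(CORRECT_DELETION_STRAINS)
non_homologous = TOTAL_TRANSFORMANTS - homologous
targeting_frequency = homologous / TOTAL_TRANSFORMANTS

print(f"Homologous integration:     {homologous} of {TOTAL_TRANSFORMANTS}")
print(f"Non-homologous integration: {non_homologous} of {TOTAL_TRANSFORMANTS}")
print(f"Targeting frequency:        {targeting_frequency:.0%}")
```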

An additional PCR reaction was performed to investigate the presence of mpaC in the transformants (FIG. 9C). Surprisingly, a 2.6 kb PCR product corresponding to 1/3 of mpaC was detected in all strains, including the mpaC deletion strains. Therefore, PCR fragments analogous to those illustrated in FIGS. 9A and 9B from 4 mpaC deletion strains (MPA1-1, MPA1-3, MPA2-5 (not in FIG. 9A/B), MPA2-9 (not in FIG. 9A/B)) were further characterized by sequencing using primers located at both ends of each PCR fragment. Sequencing results confirmed that those strains were the correct mpaC deletion strains.

Example 21 Metabolite Analysis of mpaC Deletion Strains

Metabolites were extracted from both the parental strain and the mpaC deletion strains grown on YES agar at 25° C. for 5 days and investigated by HPLC. Six plugs (6 mm in diameter) were taken from each culture, transferred to a 2-ml vial and extracted with 1 ml ethyl acetate containing 0.5% (v/v) formic acid in an ultrasonication bath for 60 minutes. The ethyl acetate extract was transferred to a new vial and evaporated to dryness in a rotary vacuum concentrator (RVC; Christ Frees Drier, USA). The dried extracts were re-dissolved in 400 μl methanol by ultrasonication (10 minutes) and filtered through a 0.45-1 μm Minisart RC4 filter (Sartorius, Germany) into a clean vial before HPLC analysis.

The HPLC profiles of the wild-type and some transformants are shown in FIG. 10. The relative amount of mycophenolic acid produced by all strains is shown in FIG. 11. Of the 20 strains tested, exactly those 9 strains that the PCR analyses verified as ΔmpaC showed a 35 to 64% reduction in MPA productivity as compared to the wild type. This confirms that mpaC is involved in MPA production in P. brevicompactum.
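The percentage reduction in MPA productivity reported above is a relative measure against the wild type. A minimal sketch of that calculation is given below. The peak-area values are hypothetical placeholders chosen only to exercise the function and are not measurements from this work; the function names and the use of HPLC peak area as the productivity measure are likewise assumptions.

```python
def percent_reduction(wild_type_signal: float, mutant_signal: float) -> float:
    """Percentage reduction of the mutant signal relative to the wild type."""
    if wild_type_signal <= 0:
        raise ValueError("wild-type signal must be positive")
    return (wild_type_signal - mutant_signal) / wild_type_signal * 100.0


def within_reported_range(reduction_pct: float) -> bool:
    """True if the reduction lies in the 35 to 64% range stated in the text."""
    return 35.0 <= reduction_pct <= 64.0


# Hypothetical MPA peak areas (arbitrary units); NOT data from this work.
hypothetical_wild_type = 1000.0
hypothetical_mutant = 500.0

reduction = percent_reduction(hypothetical_wild_type, hypothetical_mutant)
print(f"Reduction: {reduction:.0f}% (within reported range: "
      f"{within_reported_range(reduction)})")
```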

Based on both PCR and HPLC results, it is concluded that 9 strains (MPA1-1, MPA1-3, MPA1-8, MPA2-3, MPA2-4, MPA2-5, MPA2-6, MPA2-7 and MPA2-9) are the correct mpaC deletion strains and that the mpaC gene is involved in mycophenolic acid production. This result is clear despite the fact that the production of mycophenolic acid in those strains was not completely abolished, which corresponds with the PCR results shown in FIG. 9C indicating that mpaC is somehow still present in all of these strains. There may be several explanations for this phenomenon: P. brevicompactum might have more than one copy of the chromosome, as known from Saccharomyces cerevisiae, or heterokaryons between the deletion and non-deletion strains may have formed during the transformation experiments. More likely, however, P. brevicompactum forms multikaryous protoplasts, i.e. protoplasts containing more than one nucleus, of which only some are transformed during transformation. This explains well the obtained PCR fragments as well as the substantial reduction in MPA productivity.

In conclusion, the performed experiments show that mpaC is a key gene involved in the production of MPA by P. brevicompactum.

1. (canceled)
 2. An expression vector comprising a polynucleotide sequence that encodes a polypeptide, having an amino acid sequence that is at least 70% identical to the sequence set forth in SEQ ID NO: 3, and wherein said polypeptide is a polyketide synthase.
 3. A host cell comprising the expression vector of claim 2.
 4. The host cell of claim 3, wherein the cell is a fungus.
 5. The host cell of claim 3, wherein said cell is a Penicillium.
 6. The host cell of claim 3, wherein said cell is Penicillium brevicompactum.
 7. The expression vector of claim 2, having an amino acid sequence that is at least 80% identical to the sequence set forth in SEQ ID NO: 3, and wherein said polypeptide is a polyketide synthase.
 8. A method of cultivating the host cell of claim 3 comprising: providing the host cell; and growing said host cell in a growth medium.
 9. The method of claim 8, further comprising recovering mycophenolic acid (MPA) from said growth medium.
 10. The method of claim 8, wherein said host cell is Penicillium brevicompactum.
 11. The expression vector of claim 2, further comprising a polynucleotide encoding a polypeptide of the sequence set forth in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7 and/or SEQ ID NO: 8.